Korean Patent KR20020011408A: Method of profiling disparate communications and signal processing standards and services

Abstract:
The method for profiling heterogeneous communication and signal processing standards begins with selecting a set of standards for analysis (32). Next, the functions performed by the set of standards are identified (34) and ranked (36). A set of high-ranking functions is selected for implementation as kernels (38), and the set of kernels forms a programmable processor (42) that enables the implementation of any one of a set of communication and signal processing standards.
Publication number: KR20020011408A
Application number: KR1020017014175
Filing date: 2000-05-05
Publication date: 2002-02-08
Inventor: 서브라마니안래비
Applicant: 모픽스 테크놀로지 아이엔씨
IPC main class:

Description:

METHOD OF PROFILING DISPARATE COMMUNICATIONS AND SIGNAL PROCESSING STANDARDS AND SERVICES
[2] Signal processing protocols and standards are expanding with the development of wireless communication devices and services. Current communication protocols include frequency division multiplexing (FDM), time division multiple access (TDMA), and code division multiple access (CDMA). The United States, Europe, Japan, and Korea have all developed their own standards for each communication protocol. TDMA standards include Interim Standard-136 (IS-136), Global System for Mobile Communications (GSM), and General Packet Radio Service (GPRS). CDMA standards include Global Positioning System (GPS), Interim Standard-95 (IS-95), and Wideband CDMA (WCDMA). Wireless communication services include paging, voice, and data applications.
[3] Until recently, wireless communication devices supported a single communication standard. In theory, a wireless communication device can be designed using a general-purpose digital signal processor (DSP) that is programmed to realize a set of functional blocks specifying the minimum performance requirements for an application. To achieve these minimum performance requirements, system designers design algorithms (ordered sequences of arithmetic, trigonometric, logic, control, memory access, and indexing operations, etc.) to encode, transmit, and decode signals. These algorithms are typically specified in software. The set of algorithms that achieves a target performance specification is referred to collectively as an executable specification. This executable specification is typically compiled for and executed on the DSP through the use of a compiler. Despite the increasing computational power and speed, and the decreasing memory cost and size, of general-purpose DSPs, designers have been unable to meet cost, performance, and speed requirements by programming general-purpose DSPs with executable specifications for standard-specific applications.
[4] Additional dedicated high-speed processing is required, a need that has traditionally been met using application-specific processors. As used herein, an application-specific processor is a processor that excels at the efficient execution (performance, scope, flexibility) of a set of algorithms tailored to the application. Application-specific processors perform poorly on algorithms outside their intended application space. In other words, the improved speed and performance efficiency of an application-specific processor comes at the expense of functional flexibility.
[5] There is now a need for a wireless communication device that supports a changing class of wireless communication services across multiple standards. Today's solution to this problem involves connecting multiple application-specific processors to achieve multi-standard operation, which adds cost in terms of design resources, design time, and silicon area. FIG. 1 is a block diagram of a wireless communication device designed according to this approach. The device of FIG. 1 includes a microcontroller 20 and a DSP 22 that access memory 24. The device also includes a set of application-specific fixed-function circuits 26A-26D: AMPS circuits 26A, CDMA circuits 26B, IS-136 circuits 26C, and GSM circuits 26D.
[6] In view of the above, it is highly desirable to provide a technique for profiling heterogeneous communication and signal processing standards that facilitates the implementation of a signal processor supporting those standards in a cost-, area-, and performance-efficient manner.
[1] The present invention relates generally to the design of multi-function digital devices. More specifically, the present invention relates to a technique for profiling disparate communication and signal processing standards and services to facilitate the development of application-specific processors.
[9] In order to better understand the present invention, reference should be made to the following detailed description with reference to the accompanying drawings.
[10] FIG. 1 illustrates a prior art communication and signal processing system utilizing a set of application-specific processors.
[11] FIG. 2 illustrates a method of profiling communication and signal processing functions across multiple standards in accordance with an embodiment of the present invention.
[12] FIG. 3 is a block diagram of the canonical functions of a receiver.
[13] FIG. 4 illustrates a set of subfunctions for implementing a parameter estimator.
[14] FIG. 5 illustrates a table for ranking subfunctions according to computational intensity.
[15] FIG. 6 illustrates a kernel for implementing a function.
[16] FIG. 7A is a first partial diagram of a method for identifying components of an add-compare-select loop of the Viterbi algorithm.
[17] FIG. 7B is a second partial diagram of a method for identifying components of an add-compare-select loop of the Viterbi algorithm.
[18] FIG. 7C is a third partial diagram of a method for identifying components of an add-compare-select loop of the Viterbi algorithm.
[19] FIG. 8 illustrates a method for identifying a critical sequence of operations for a finite impulse response (FIR) filter.
[20] FIG. 9 illustrates a process for profiling a standard function.
[21] FIG. 10 illustrates a programmable multi-standard application-specific processor.
[22] FIG. 11 illustrates an example of the required programmable interconnect between kernels for a given application.
[23] Like reference numerals refer to corresponding parts throughout the drawings.
[7] The method of the present invention profiles heterogeneous communication and signal processing standards in order to define a programmable processor that can be programmed to execute those standards. The method includes selecting a set of communication and signal processing standards for analysis and identifying the functions common to the selected set of standards. The common functions are then ranked according to computational intensity. Using this ranking, a set of high-computational-intensity functions is selected for implementation as kernels, and the set of kernels forms a programmable processor on which any one of the set of communication and signal processing standards can be implemented.
[8] The present invention enables identification of optimal datapaths and control state machines for use in the design of application-specific processors. The method can be used to identify functions that are executed inefficiently by existing microprocessors and digital signal processors. The technique can also define the new datapaths and state machines needed to implement those functions efficiently. The method of the present invention provides a systematic way of analyzing functions across many applications or standards, reducing the time needed to define processor architectures and increasing the amount of design reuse available in the design of new processors for digital signal processing in multi-standard applications.
[24] FIG. 2 illustrates the steps of the method 30 of the present invention for profiling and analyzing functions across many signal processing applications in order to design a processor that can be programmed to efficiently execute the algorithms associated with a profiled signal processing standard or application. The process of FIG. 2 reduces the time needed to define the processor architecture and increases the amount of design reuse that is possible in the design of new processors for digital signal processing in multi-standard applications. In short, the method of the present invention begins with the selection of a set of communication and signal processing standards and services for analysis. Next, the functions common to the selected set of communication and signal processing standards are identified. The common functions are then ranked according to computational intensity, and a set of high-computational-intensity functions is chosen for implementation as programmable kernels, which together form a programmable multi-standard processor.
[25] First, during step 32, a set of communication and signal processing standards is selected for analysis from a set of possible standards. Although any set of standards can be selected in accordance with the present invention, the selection can be influenced by the target market for the programmable processor being designed. For example, the target market may be a manufacturer of wireless mobile devices intended for sale in Japan.
[26] A. Identifying Common Canonical Functions
[27] Still referring to FIG. 2, once a set of communication and signal processing standards has been selected, a set of common functions is identified for the selected application during step 34. By way of example, FIG. 3 shows the function blocks when the selected application is the baseband processor 51 of a receiver. The function blocks to be implemented are the digital front-end processor 52, detector/demodulator 54, symbol decoder 56, source decoder 58, and parameter estimator 60. For each of the function blocks of the baseband processor 51, each selected communication and signal processing standard specifies a number of subfunctions. For example, consider FIG. 4, which shows in table form a set of subfunctions for implementing a parameter estimator 60 for multiple standards. Many parameter estimation subfunctions are common to many standards. For example, IS-136, GSM, GPRS, EDGE, IS-95B, IS-2000, and WCDMA-FDD all use a windowed average energy estimator.
[28] B. Ranking Functions
[29] FIG. 2 shows that during step 36, the function blocks are ranked to identify functions that are not suitable for implementation by programming a general-purpose DSP. In other words, functions are ranked to identify those that are suitable for implementation in an application-specific multi-standard processor. This is a multi-step process that begins with generating an executable specification for each function across the selected communication and signal processing standards. Preferably, the executable specification is coded in C or C++. The executable specification for each standard is ranked using a number of metrics. One useful metric is the computational intensity of each function. The computational intensity of each function may be determined by dynamic profiling of each executable specification to quantify the associated number of millions of operations per second (MOPS). This can be done through simulation and automated test benches. The result can be presented as a table showing which functions have the highest MOPS. Such characterization may be made with respect to a general-purpose processor or with respect to a specific digital signal processor or microprocessor. If the characterization is made with respect to a particular processor, the executable specification must be executed on that processor for profiling purposes. The resulting table identifies the functions for which the instruction set architecture, datapath, or memory bandwidth of that processor is poorly suited.
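As an illustration of the ranking in step 36, the following minimal sketch computes MOPS from dynamic operation counts and sorts functions by it. The class name, the function names, and every number are hypothetical placeholders chosen for the example; they are not taken from FIG. 5 or from any measurement in the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfiledFunction:
    """One function of an executable specification after dynamic profiling."""
    name: str
    operations: int   # operations counted during simulation (placeholder value)
    seconds: float    # duration of the signal that was processed

    @property
    def mops(self) -> float:
        # Millions of operations per second needed to keep up in real time.
        return self.operations / self.seconds / 1e6


def rank_by_mops(functions):
    """Return the functions ordered from most to least computationally intensive."""
    return sorted(functions, key=lambda f: f.mops, reverse=True)


# Hypothetical profile of three subfunctions over one second of signal.
profile = [
    ProfiledFunction("windowed_energy_estimator", 2_000_000, 1.0),
    ProfiledFunction("rx_filter", 48_000_000, 1.0),
    ProfiledFunction("complex_spreader", 30_000_000, 1.0),
]

ranked = rank_by_mops(profile)
for f in ranked:
    print(f"{f.name}: {f.mops:.1f} MOPS")
```

In a real flow the operation counts would come from instrumented simulation of the C or C++ executable specification rather than from hand-entered constants.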
[30] FIG. 5 shows a portion of such a table, including MOPS for a single standard and a subset of the subfunctions of baseband processor 51 (see FIG. 3). The computational intensity of each subfunction is indicated for a subset of the channels supported by the baseband processor 51. FIG. 5 shows that the receive (Rx) filter is the most computationally intensive of the enumerated subfunctions and is therefore best suited for implementation in a programmable application-specific processor. FIG. 5 also indicates that the complex spreader is computationally intensive and suitable for a programmable application-specific processor. Other subfunctions that are computationally intensive, but not shown in FIG. 5, are the RAKE receiver, turbo coder, interference canceller, multi-user detector, and searcher.
[31] Other metrics that can be used to rank the functions across a selected set of communication and signal processing standards include power consumption and silicon area. Determining the power consumption of each function requires identifying the amount of time the function spends on each of a set of operation types. The set of operation types includes move-and-transfer, loop-and-control, trigonometric, and arithmetic. Each type of operation consumes a certain amount of power (in mW) per operation. Thus, given the number of operations of each type, the total power consumption of each function can be determined across the selected set of communication and signal processing standards. This analysis is likely to reveal that RAKE receivers tend to consume large amounts of power compared to other subfunctions. The silicon area required to store executable code can be estimated for each function across the selected communication and signal processing standards by counting the number and type of operations required for each executable specification and then using a cost table showing the cost in silicon area for each operator. Once again, the RAKE receiver may require more gates to store executable code than other subfunctions.
[32] After the functions are ranked using the selected set of metrics, a set of high-ranking functions is selected during step 38 (see FIG. 2) for implementation and further analysis.
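The power and silicon-area estimates described above both reduce to a weighted sum of operation counts against a per-operation cost table. The sketch below shows that calculation under stated assumptions: the cost values, the operation counts, and the helper name `weighted_cost` are all hypothetical, and the units are arbitrary.

```python
# The four operation types named in the description.
OP_TYPES = ("move_and_transfer", "loop_and_control", "trigonometric", "arithmetic")


def weighted_cost(op_counts, cost_per_op):
    """Sum count * unit cost over every operation type.

    op_counts:   mapping of operation type to number of operations
    cost_per_op: mapping of operation type to unit cost (power or area)
    """
    return sum(op_counts.get(op, 0) * cost_per_op[op] for op in OP_TYPES)


# Hypothetical cost table in arbitrary units per operation.
POWER_COST = {
    "move_and_transfer": 1,
    "loop_and_control": 2,
    "trigonometric": 8,
    "arithmetic": 3,
}

# Hypothetical operation counts for two subfunctions.
rake_receiver = {
    "move_and_transfer": 100,
    "loop_and_control": 50,
    "trigonometric": 10,
    "arithmetic": 200,
}
energy_estimator = {
    "move_and_transfer": 20,
    "loop_and_control": 10,
    "arithmetic": 40,
}

rake_power = weighted_cost(rake_receiver, POWER_COST)
estimator_power = weighted_cost(energy_estimator, POWER_COST)
print(rake_power, estimator_power)
```

The same helper can be reused for the silicon-area metric by passing an area cost table in place of `POWER_COST`.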
[33] C. Analysis and Delegation of High Rank Functions
[34] Referring again to FIG. 2, during step 40, the selected set of functions is analyzed across multiple standards to identify a computational kernel that is common across all instances of each function. (As used herein, a kernel refers to a sequence of operations that can be represented by a control-dataflow graph and implemented in software or hardware. FIG. 6 shows, in block form, a kernel 65 that includes three modules: a sequencer 66, a local memory 67, and a parameterizable and configurable arithmetic logic unit 68.) In other words, during step 40, a function-centric approach to profiling is taken rather than an application-centric one.
[35] Profiling of functions begins with the executable specification of each "standard-specific" version of the function, and with simulation to optimize all signal and variable word widths. Profiling of the functions includes identifying a critical sequence of operations. The sequence of operations may include move-and-transfer, loop-and-control, trigonometric, or arithmetic operations. As used herein, a critical sequence of operations, or component, is a sequence of operations whose timely completion is required to perform a standard function within a given time. By way of example, FIGS. 7A-7C illustrate a method of identifying components of the add-compare-select loop of a machine-implemented Viterbi algorithm. The machine-implemented Viterbi algorithm is a dynamic programming algorithm used in digital communications to obtain the most likely sequence of symbols sent in a digital transmission system. FIG. 7A illustrates the first two steps of the machine-implemented Viterbi algorithm. FIG. 7B shows the third step of the add-compare-select cycle of the machine-implemented Viterbi algorithm, including a computation stage and a survivor storage stage. FIG. 7C shows the data flow and control flow of the add-compare-select cycle of the machine-implemented Viterbi algorithm. FIG. 7C also shows the relationship between the loop containing the sequence of operations used during a cycle and the sequence of operations during one iteration of the machine-implemented Viterbi algorithm.
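For readers unfamiliar with the add-compare-select (ACS) loop, the following minimal sketch performs one ACS iteration over a trellis stage. It is a generic textbook formulation, not a transcription of FIGS. 7A-7C; the function name, the data layout, and the two-state example values are assumptions made for illustration.

```python
def acs_step(path_metrics, branch_metrics, predecessors):
    """Run one add-compare-select iteration of the Viterbi algorithm.

    path_metrics[s]        accumulated metric of the survivor ending in state s
    predecessors[s]        states that have a transition into state s
    branch_metrics[(p, s)] cost of the transition p -> s for the current symbol
    Returns (new_metrics, survivors), where survivors[s] is the chosen predecessor.
    """
    new_metrics = []
    survivors = []
    for state, preds in enumerate(predecessors):
        # Add: extend every candidate path by its branch metric.
        candidates = [(path_metrics[p] + branch_metrics[(p, state)], p) for p in preds]
        # Compare and select: keep the lowest-cost path (ties go to the lower state index).
        best_metric, best_pred = min(candidates)
        new_metrics.append(best_metric)
        survivors.append(best_pred)   # survivor storage for later traceback
    return new_metrics, survivors


# Hypothetical two-state trellis in which each state is reachable from both states.
predecessors = [[0, 1], [0, 1]]
path_metrics = [0, 3]
branch_metrics = {(0, 0): 1, (1, 0): 0, (0, 1): 2, (1, 1): 1}

new_metrics, survivors = acs_step(path_metrics, branch_metrics, predecessors)
print(new_metrics, survivors)
```

The inner add, compare, and select operations run once per state per received symbol, which is why this loop is a natural candidate for a critical sequence of operations.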
[36] As another example of a method for identifying a component of a standard function, FIG. 8 illustrates a machine-implemented method for identifying the critical sequence of operations for a finite impulse response (FIR) filter. The equation shown expresses mathematically the convolution of the input sequence x(n) with the set of filter coefficients a(n). The structure shown below the equation in FIG. 8 shows the most common subset of data flow and control flow operations in the realization of an FIR filter. The highlighted portion of FIG. 8 shows all the operations required for a single stage of the FIR filter.
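The equation of FIG. 8 is not reproduced in this text. Assuming it is the standard FIR convolution y(n) = sum over k of a(k) * x(n - k), a direct-form implementation can be sketched as follows; the function name `fir_filter` and the sample values are illustrative only.

```python
def fir_filter(x, a):
    """Direct-form FIR filter: y[n] = sum_k a[k] * x[n - k].

    x: input samples; samples before x[0] are treated as zero
    a: filter coefficients (taps)
    """
    y = []
    for n in range(len(x)):
        acc = 0
        for k, coeff in enumerate(a):
            if n - k >= 0:
                # One filter stage: a delayed sample, a multiply, and an accumulate.
                acc += coeff * x[n - k]
        y.append(acc)
    return y


# A two-tap filter with unit coefficients sums each sample with its predecessor.
print(fir_filter([1, 2, 3, 4], [1, 1]))
```

The body of the inner loop corresponds to a single filter stage, which is the kind of repeated operation sequence that profiling is meant to expose.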
[37] After the functions are profiled, the standard functions are analyzed across multiple standards to identify the components that are common across all instances of a function and the components that vary. The process of standard function profiling can be best understood with reference to FIG. 9. At the bottom of FIG. 9, a set of independent standards for wireless applications is listed, including GPS, IS-95 CDMA, W-CDMA, IS-136 TDMA, and GSM. The function profile for the particular application, in this case the baseband processor 51, is listed on the left side of FIG. 9. The standard functions of the baseband processor 51 include an MPSK frequency estimator, a convolutional decoder, a RAKE receiver, and an MLSE equalizer.
[38] FIG. 9 shows, in rectangular form, the sets of functional components 70a-g, 72a-d, 74a-d, and 76a-d that make up each standard function. Each rectangular set of functional components is divided into a plurality of squares, where each square represents a single component 71, 73. Although each set of functional components 70, 72, 74, 76 is shown as including six components 71, 73, the number of components 71, 73 per set depends on the respective standard function; the number shown is arbitrary and chosen for illustrative purposes. In FIG. 9, the components 73 that are common to all functional component sets for a standard function are white, while the variable components 71 are black. The numbers of variable and common components shown are likewise arbitrary. Analysis of the functional component sets 70a-70d for the MPSK frequency estimator shows three components 73 common to all CDMA standards and three components 71 that vary according to the CDMA standard. This indicates that a single set of kernels can be designed to support all CDMA standards if the kernel set is partially programmable to allow implementation of the variable components 71. Similarly, analysis of the functional component sets 70e-70g shows three components 73 common to all TDMA standards and three components 71 that vary according to the TDMA standard. This allows a single kernel set to be designed to support all profiled TDMA standards, if the kernel set is partially programmable. (Partial programmability is necessary to enable the implementation of the variable components 71.) Moreover, according to the profiling, a single set of partially programmable kernels 78 can be designed to support all of the CDMA and TDMA functional component sets 70a-g. Analysis of the functional component sets associated with the other standard functions leads to similar conclusions. In other words, a single set of partially programmable kernels 82 can be designed to support all of the functional component sets 72a-72d associated with the convolutional decoder function, a single set of partially programmable kernels 84 can be designed to support the functional component sets 74a-74d associated with the RAKE receiver function, and a single set of partially programmable kernels 86 can be designed to support the functional component sets 76a-76d associated with the MLSE equalization function.
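The split into common components 73 and variable components 71 can be viewed as a set intersection across the per-standard versions of a function. The sketch below illustrates that view; the helper name `split_components` and the component names are hypothetical and do not come from FIG. 9.

```python
def split_components(components_by_standard):
    """Separate components shared by every standard from those that vary.

    components_by_standard: mapping of standard name to the component names
    that make up that standard's version of one function.
    Returns (common, variable): common is a set, variable maps each standard
    to the components not shared by all standards.
    """
    if not components_by_standard:
        return set(), {}
    sets = {std: set(comps) for std, comps in components_by_standard.items()}
    common = set.intersection(*sets.values())
    variable = {std: comps - common for std, comps in sets.items()}
    return common, variable


# Hypothetical component lists for one function under three standards.
frequency_estimator = {
    "IS-95": ["correlate", "accumulate", "phase_detect", "pilot_a"],
    "W-CDMA": ["correlate", "accumulate", "phase_detect", "pilot_b"],
    "GPS": ["correlate", "accumulate", "phase_detect", "code_c"],
}

common, variable = split_components(frequency_estimator)
print(sorted(common))
print({std: sorted(v) for std, v in variable.items()})
```

Under this view, the common set maps to the fixed part of a kernel and the per-standard remainders map to what its programmable part must cover.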
[39] During step 42 (see FIG. 2), for functions with extensive overlap, each partially programmable set of kernels is designed to have fixed and programmable units. As described with reference to FIG. 6, the kernel 65 includes three modules 66, 67, and 68 that make up a computing unit. Preferably, sequencer 66 and ALU 68 are partially programmable. Thus, the programmable portions of sequencer 66 and ALU 68 form a programmable computing unit, while memory 67 and the fixed portions of sequencer 66 and ALU 68 form a fixed computing unit. By programming the programmable units of the kernel, all of its components 71, 73 can be realized.
[40] Referring again to FIG. 9, the sets of partially programmable kernels 78, 82, 84, 86 enable the creation of multi-standard, protocol-specific engines 90, 92. Engine 90 is a standards-independent, CDMA-specific processor that includes a partially programmable set of kernels for each standard function of the application. Thus, engine 90 may include, for example, the partially programmable sets of kernels 78, 82, 84, 86. Similarly, engine 92 is a standards-independent, TDMA-specific processor that includes a partially programmable set of kernels for each standard function of the application. Furthermore, given a partially programmable set of kernels for each standard function, a multi-standard engine 94 may be designed that is protocol-independent.
[41] FIG. 10 illustrates a programmable, multi-standard, application-specific processor 100 in block form. The processor 100 includes a program control unit 102, a kernel bank 104, and a reconfigurable data router 106. The program control unit 102 controls the programming of the kernel bank 104 and the reconfigurable data router 106 so that the processor 100 can be configured to support any one of the supported set of standards. The program control unit 102 includes a memory 110 that stores the high-level code for programming the controller 112 and the bus manager 114. The controller 112 controls the programming of the programmable unit within each kernel of the kernel bank 104, while the bus manager 114 controls the configuration of the reconfigurable data router 106. Kernel bank 104 includes a plurality of kernels, one for each standard function of the application. Reconfigurable data router 106 routes data between kernels to implement applications in accordance with a given standard. Reconfigurable data router 106 need not be fully programmable. FIG. 11 shows an example of the interconnections between kernels that must be programmable for a given application. The kernels of the application are listed both along the top and down the left side of FIG. 11. The interconnects that must be supported for the application are marked with an x. For each kernel, there are relatively few interconnects that must be supported. For example, the turbo decoder kernel only needs to be able to connect to the convolutional decoder kernel and the memory management unit kernel.
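A sparse interconnect table like the one described for FIG. 11 can be modelled as a set of kernel pairs. In the sketch below only the two turbo decoder links come from the text; the fourth kernel, its link, the helper names, and the treatment of links as undirected are assumptions made for illustration.

```python
from itertools import combinations

# Kernels in a hypothetical kernel bank.
KERNELS = [
    "turbo_decoder",
    "convolutional_decoder",
    "memory_management_unit",
    "rake_receiver",
]

# Interconnects the router must support, treated here as undirected pairs.
REQUIRED_LINKS = {
    frozenset(("turbo_decoder", "convolutional_decoder")),   # from the text
    frozenset(("turbo_decoder", "memory_management_unit")),  # from the text
    frozenset(("rake_receiver", "memory_management_unit")),  # hypothetical
}


def neighbors(kernel, links):
    """Return the kernels that the given kernel must be able to exchange data with."""
    return sorted(other for link in links if kernel in link
                  for other in link if other != kernel)


def link_usage(kernels, links):
    """Return (required, possible) counts of kernel-to-kernel interconnects."""
    possible = len(list(combinations(kernels, 2)))
    return len(links), possible


print(neighbors("turbo_decoder", REQUIRED_LINKS))
print(link_usage(KERNELS, REQUIRED_LINKS))
```

Comparing the required count with the possible count gives a rough sense of how much of a full crossbar the router can leave out.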
[42] Those skilled in the art will appreciate that the present invention provides a systematic method of designing a processor for multiple standards, multiple functions, and multiple parathemes. In addition, by providing function profiling and the definition of datapaths and control state machines that can be reused across many processors, the technique of the present invention reduces processor design cycle time.
[43] For purposes of explanation, the foregoing description uses specific nomenclature to provide a thorough understanding of the present invention. However, those skilled in the art will appreciate that the specific details are not required to practice the invention. In other instances, well-known circuits and devices are shown in block form in order to avoid unnecessarily obscuring the present invention. Accordingly, the foregoing descriptions of specific embodiments of the present invention have been presented for purposes of illustration and description. They are not intended to be exhaustive or to limit the invention to the precise forms disclosed. The embodiments were selected and described in order to best explain the present invention and its applications, so that those skilled in the art can best utilize the present invention and its various embodiments with the modifications suited to particular uses. It is intended that the scope of the invention be defined by the following claims and their equivalents.

Claims:
Claims (5)
[1] A method for profiling disparate communication and signal processing standards, the method comprising:
selecting a set of communication and signal processing standards for analysis;
identifying functions performed by the set of communication and signal processing standards;
ranking the functions according to computational intensity; and
selecting a set of high computational intensity functions to implement as kernels.
[2] The method of claim 1, further comprising:
profiling the high computational intensity functions across the set of communication and signal processing standards to identify a common set of operation sequences and a variable set of operation sequences; and
defining each kernel as including a fixed computation unit that implements the common set of operation sequences, and a programmable unit that implements the variable set of operation sequences,
wherein the kernel is programmable to implement any one of the set of communication and signal processing standards.
[3] A method for profiling disparate communication and signal processing standards, the method comprising:
selecting a set of communication and signal processing standards for analysis;
identifying functions performed by the set of communication and signal processing standards;
ranking the functions according to a set of metrics; and
selecting a set of high-ranking functions for implementation in a programmable processor.
[4] The method of claim 3, further comprising:
profiling the high-ranking functions across the set of communication and signal processing standards to identify a common set of operation sequences and a variable set of operation sequences;
defining a kernel for each high-ranking function, each kernel including a fixed operation unit implementing the common set of operation sequences, and a programmable unit implementing the variable set of operation sequences; and
defining the programmable processor as including the kernels for the high-ranking functions,
wherein the kernels are programmable such that the programmable processor implements any one of the set of communication and signal processing standards.
[5] The method of claim 3, wherein the set of metrics includes at least one of computational intensity, power consumption, and silicon area.

Similar technologies:

Publication number | Publication date | Title

US9594723B2|2017-03-14|Apparatus, system and method for configuration of adaptive integrated circuitry having fixed, application specific computational elements

US9164952B2|2015-10-20|Adaptive integrated circuitry with heterogeneous and reconfigurable matrices of diverse and adaptive computational units having fixed, application specific computational elements

Woh et al.2009|AnySP: anytime anywhere anyway signal processing

Woh et al.2008|From SODA to scotch: The evolution of a wireless baseband processor

US9396161B2|2016-07-19|Method and system for managing hardware resources to implement system functions using an adaptive computing architecture

US6618434B2|2003-09-09|Adaptive, multimode rake receiver for dynamic search and multipath reception

KR101330059B1|2013-11-18|Programmable digital signal processor having a clustered simd microarchitecture including a complex short multiplier and an independent vector load unit

KR101256851B1|2013-04-22|Digital signal processor including a programmable network

JP6526415B2|2019-06-05|Vector processor and method

US6442672B1|2002-08-27|Method for dynamic allocation and efficient sharing of functional unit datapaths

JP4326953B2|2009-09-09|System for the construction and operation of an adaptive integrated circuit with fixed application-specific computing elements

Rauwerda et al.2007|Towards software defined radios using coarse-grained reconfigurable hardware

KR101394573B1|2014-05-12|Programmable digital signal processor including a clustered simd microarchitecture configured to execute complex vector instructions

Cardoso et al.2002|XPP-VC: A C compiler with temporal partitioning for the PACT-XPP architecture

US8510534B2|2013-08-13|Scalar/vector processor that includes a functional unit with a vector section and a scalar section

Lin et al.2006|Soda: A low-power architecture for software radio

US7043682B1|2006-05-09|Method and apparatus for implementing decode operations in a data processor

JP3609513B2|2005-01-12|Microprocessor

Kissler et al.2006|A highly parameterizable parallel processor array architecture

JP5075313B2|2012-11-21|Method of generating a configuration for a configurable spread spectrum communication device

JP6059413B2|2017-01-11|Reconfigurable instruction cell array

US9002998B2|2015-04-07|Apparatus and method for adaptive multimedia reception and transmission in communication environments

US6912706B1|2005-06-28|Instruction processor and programmable logic device cooperative computing arrangement and method

KR101842061B1|2018-05-14|Vector processing engines employing a tapped-delay line for filter vector processing operations, and related vector processor systems and methods

RU2147378C1|2000-04-10|Special-purpose processor

Patent family:

Publication number | Publication date

EP1177700A1|2002-02-06|

JP2003527768A|2003-09-16|

CA2371140A1|2000-11-16|

KR100743882B1|2007-07-30|

WO2000069192A9|2002-06-13|

AU5127200A|2000-11-21|

WO2000069192A1|2000-11-16|

Cited references:

Publication number | Filing date | Publication date | Applicant | Title

Legal status:
1999-05-07|Priority to US13313099P

1999-05-07|Priority to US60/133,130

2000-05-05|Application filed by 모픽스 테크놀로지 아이엔씨

2002-02-08|Publication of KR20020011408A

2007-07-30|Application granted

2007-07-30|Publication of KR100743882B1

Priority:

Application number | Filing date | Title

US13313099P| true| 1999-05-07|1999-05-07|

US60/133,130|1999-05-07|
